Fix aggte failure after pickle reload due to non-numeric column (Issue #71) #73

Merged
alexanderquispe merged 1 commit into d2cml-ai:main from gsaco:issue71
Jan 13, 2026
gsaco (Collaborator) commented Jan 13, 2026

Resolution of Issue #71

This pull request fixes Issue #71, where aggte() fails after reloading a csdid result from a pickle file. The attached notebook provides a minimal replication of the failure and demonstrates the corrected behavior.

aggte_rowid_jupytext.ipynb

Root cause:

After reloading a saved csdid object, the internal data frame used in aggte_fnc/compute_aggte.py may contain a non-numeric column named rowid. During aggregation, the code groups the data and applies mean() across remaining columns, causing pandas to raise:

TypeError: agg function failed [how->mean, dtype->object]
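
The failure mode can be sketched outside csdid with a few lines of pandas. This is only an illustration, not the library's internal code: the frame below is a hypothetical stand-in, and every name except rowid is an assumption. On recent pandas versions the mean() call is expected to raise a TypeError; older versions may instead drop the non-numeric column silently, so the sketch records the outcome rather than assuming it.

```python
import pandas as pd

# Hypothetical stand-in for the internal data frame; only the column
# name `rowid` comes from the pull request, the rest is illustrative.
data = pd.DataFrame({
    "group": [1, 1, 2, 2],
    "att": [0.10, 0.30, 0.20, 0.40],
    # Force object dtype to mimic the non-numeric column after reload.
    "rowid": pd.Series(["a", "b", "c", "d"], dtype=object),
})

try:
    # mean() is applied to every non-grouping column, including rowid.
    data.groupby("group").mean()
    failed = False
except TypeError as exc:
    failed = True
    print(f"TypeError: {exc}")
```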

Fix:

Before the aggregation step, the fix removes the rowid column if present:

if 'rowid' in data.columns:
    data = data.drop(columns=['rowid'])

This restores correct execution of aggte() for both freshly computed and reloaded csdid objects, without affecting existing results.
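
A minimal sketch of the same guard applied to a toy frame, assuming the aggregation is a plain groupby followed by mean(). The helper drop_rowid and the columns group and att are hypothetical names used for illustration only; the actual change lives in aggte_fnc/compute_aggte.py.

```python
import pandas as pd


def drop_rowid(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper wrapping the guard from the fix."""
    if "rowid" in data.columns:
        # drop() returns a new frame; the caller's frame is not modified.
        data = data.drop(columns=["rowid"])
    return data


# Toy frames: one without rowid (freshly computed), one with a
# non-numeric rowid column (as seen after a pickle reload).
fresh = pd.DataFrame({
    "group": [1, 1, 2, 2],
    "att": [0.10, 0.30, 0.20, 0.40],
})
reloaded = fresh.assign(rowid=pd.Series(["a", "b", "c", "d"], dtype=object))

# With the guard in place, both frames aggregate to the same result.
means_fresh = drop_rowid(fresh).groupby("group").mean()
means_reloaded = drop_rowid(reloaded).groupby("group").mean()
```

Because the guard is a no-op when rowid is absent, the freshly computed path is left unchanged, which is consistent with the claim that existing results are unaffected.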

@alexanderquispe alexanderquispe merged commit ecd128b into d2cml-ai:main Jan 13, 2026
1 check failed